AITopics | South Korea

South Korea

Exclusively Penalized Q-learning for Offline Reinforcement Learning

Yonghyeon Jo, Jungmo Kim, Sanghyeon Lee, Seungyul Han

Neural Information Processing Systems | Jun-1-2025, 21:02:54 GMT

Constraint-based offline reinforcement learning (RL) imposes policy constraints or penalties on the value function to mitigate overestimation errors caused by distributional shift. This paper focuses on a limitation of existing offline RL methods that penalize the value function: the penalty can introduce unnecessary bias into the value function and lead to underestimation. To address this concern, we propose Exclusively Penalized Q-learning (EPQ), which reduces estimation bias in the value function by selectively penalizing states that are prone to inducing estimation errors. Numerical results show that our method significantly reduces underestimation bias and improves performance in various offline control tasks compared to other offline RL methods.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

Europe > Italy (0.14)
Asia > South Korea (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Nearly Minimax Optimal Regret for Multinomial Logistic Bandit

Neural Information Processing Systems | Jun-1-2025, 14:33:59 GMT

In this paper, we study the contextual multinomial logistic (MNL) bandit problem in which a learning agent sequentially selects an assortment based on contextual information, and user feedback follows an MNL choice model. There has been a significant discrepancy between lower and upper regret bounds, particularly regarding the maximum assortment size K. Additionally, the variation in reward structures between these bounds complicates the quest for optimality. Under uniform rewards, where all items have the same expected reward, we establish a regret lower bound of $\Omega(d\sqrt{T/K})$ and propose a constant-time algorithm, OFU-MNL+, that achieves a matching upper bound of $\tilde{O}(d\sqrt{T/K})$. We also provide instance-dependent minimax regret bounds under uniform rewards. Under non-uniform rewards, we prove a lower bound of $\Omega(d\sqrt{T})$ and an upper bound of $\tilde{O}(d\sqrt{T})$, also achievable by OFU-MNL+. Our empirical studies support these theoretical findings. To the best of our knowledge, this is the first work in the contextual MNL bandit literature to prove minimax optimality -- for either the uniform or the non-uniform reward setting -- and to propose a computationally efficient algorithm that achieves this optimality up to logarithmic factors.
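The MNL choice model mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard formulation with linear utilities and an outside (no-choice) option of utility 0; the function names, the feature vectors, and the parameter vector `theta` are hypothetical and are not taken from the paper. This shows only the feedback model, not the OFU-MNL+ algorithm.

```python
import math


def mnl_choice_probabilities(features, theta):
    """Return (item_probs, no_choice_prob) for one offered assortment.

    features: list of feature vectors, one per offered item (hypothetical data)
    theta:    parameter vector of the linear utility model (hypothetical)
    """
    # Linear utility x_i^T theta for each offered item.
    utilities = [sum(x * w for x, w in zip(feat, theta)) for feat in features]
    # Shift by the largest utility (the outside option has utility 0)
    # so that the exponentials cannot overflow.
    shift = max([0.0] + utilities)
    weights = [math.exp(u - shift) for u in utilities]
    outside = math.exp(-shift)
    total = outside + sum(weights)
    return [w / total for w in weights], outside / total


def expected_reward(features, theta, rewards):
    """Expected reward of an assortment; the no-choice option pays 0."""
    probs, _ = mnl_choice_probabilities(features, theta)
    return sum(p * r for p, r in zip(probs, rewards))


if __name__ == "__main__":
    assortment = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # K = 3 items, d = 2
    theta = [0.5, -0.2]
    print(mnl_choice_probabilities(assortment, theta))
    print(expected_reward(assortment, theta, [1.0, 1.0, 1.0]))
```

Under uniform rewards of 1, the expected reward of an assortment in this sketch equals one minus the no-choice probability, which is one way to see why the assortment size K enters the uniform-reward bounds.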

artificial intelligence, inequality hold, machine learning, (18 more...)

Country: Asia > South Korea (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Improved Regret of Linear Ensemble Sampling

Neural Information Processing Systems | May-31-2025, 21:22:29 GMT

In this work, we close the fundamental gap of theory and practice by providing an improved regret bound for linear ensemble sampling.

artificial intelligence, data mining, machine learning, (21 more...)

Country: Asia > South Korea (0.14)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Bat-G net: Bat-inspired High-Resolution 3D Image Reconstruction using Ultrasonic Echoes

Gunpil Hwang, Seohyeon Kim, Hyeon-Min Bae

Neural Information Processing Systems | May-31-2025, 17:39:09 GMT

In this paper, a bat-inspired high-resolution ultrasound 3D imaging system is presented. Live bats demonstrate that ultrasound, properly used, can be used to perceive 3D space. With this in mind, a neural network referred to as the Bat-G network is implemented to reconstruct 3D representations of target objects from hyperbolic FM (HFM) chirped ultrasonic echoes. The Bat-G network consists of an encoder emulating a bat's central auditory pathway and a 3D graphical visualization decoder. For the acquisition of the ultrasound data, a custom-made Bat-I sensor module is used. The Bat-G network shows uniform 3D reconstruction results and achieves precision, recall, and F1-score of 0.896, 0.899, and 0.895, respectively. The experimental results demonstrate the feasibility of implementing a high-resolution, non-optical, sound-based imaging system like the one used by live bats.

artificial intelligence, bat-g network, machine learning, (16 more...)

Country:

North America > United States (0.14)
North America > Canada (0.14)
Asia > South Korea (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Learning Infinitesimal Generators of Continuous Symmetries from Data

Neural Information Processing Systems | May-31-2025, 15:53:22 GMT

Exploiting symmetry inherent in data can significantly improve the sample efficiency of a learning procedure and the generalization of learned models. When data clearly reveals underlying symmetry, leveraging this symmetry can naturally inform the design of model architectures or learning strategies. Yet, in numerous real-world scenarios, identifying the specific symmetry within a given data distribution often proves ambiguous. To tackle this, some existing works learn symmetry in a data-driven manner, parameterizing and learning expected symmetry through data. However, these methods often rely on explicit knowledge, such as pre-defined Lie groups, which are typically restricted to linear or affine transformations.

artificial intelligence, machine learning, symmetry, (19 more...)

Country:

Europe > United Kingdom (0.14)
North America > Canada (0.14)
Asia > South Korea (0.14)
Africa > Rwanda (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)

Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation

Neural Information Processing Systems | May-31-2025, 05:48:04 GMT

Reinforcement learning (RL) is a sequential decision-making problem in which an agent tries to maximize its expected cumulative reward by interacting with an unknown environment over time.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

Asia > South Korea (0.14)
Africa > Ethiopia (0.13)

Genre: Research Report > Experimental Study (1.00)

Industry: Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.41)

Accelerating Value Iteration with Anchoring

Jongmin Lee, Ernest K. Ryu (Department of Mathematical Science, Seoul National University)

Neural Information Processing Systems | May-30-2025, 21:32:40 GMT

Surprisingly, however, the optimal rate in terms of Bellman error for the VI setup was not known, and finding a general acceleration mechanism has been an open problem. In this paper, we present the first accelerated VI for both the Bellman consistency and optimality operators. Our method, called Anc-VI, is based on an anchoring mechanism (distinct from Nesterov's acceleration), and it reduces the Bellman error faster than standard VI. In particular, Anc-VI exhibits an O(1/k)-rate for γ ≈ 1 or even γ = 1, while standard VI has rate O(1) for γ ≥ 1 − 1/k, where k is the iteration count. We also provide a complexity lower bound matching the upper bound up to a constant factor of 4, thereby establishing optimality of the accelerated rate of Anc-VI. Finally, we show that the anchoring mechanism provides the same benefit in the approximate VI and Gauss-Seidel VI setups as well.
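The idea of anchoring can be sketched on a toy policy-evaluation problem. The sketch below is an illustration under stated assumptions, not the authors' Anc-VI: it pulls every iterate back toward the starting point U0 with a weight `beta(k)`, and the default `beta(k) = 1/(k+1)` is a Halpern-style choice assumed here because the abstract does not state the paper's coefficients. The two-state transition matrix `P`, the reward vector `r`, and all function names are hypothetical.

```python
def bellman_consistency(U, P, r, gamma):
    """Apply T(U) = r + gamma * P U for a fixed policy (policy evaluation)."""
    n = len(U)
    return [r[s] + gamma * sum(P[s][t] * U[t] for t in range(n)) for s in range(n)]


def bellman_error(U, P, r, gamma):
    """Sup-norm Bellman error ||T(U) - U||."""
    TU = bellman_consistency(U, P, r, gamma)
    return max(abs(a - b) for a, b in zip(TU, U))


def standard_vi(U0, P, r, gamma, iters):
    """Plain value iteration: U^k = T(U^{k-1})."""
    U = list(U0)
    for _ in range(iters):
        U = bellman_consistency(U, P, r, gamma)
    return U


def anchored_vi(U0, P, r, gamma, iters, beta=lambda k: 1.0 / (k + 1)):
    """Anchored iteration: U^k = beta_k * U^0 + (1 - beta_k) * T(U^{k-1}).

    beta is an assumed anchoring schedule, not the paper's coefficients.
    """
    U = list(U0)
    for k in range(1, iters + 1):
        TU = bellman_consistency(U, P, r, gamma)
        b = beta(k)
        U = [b * u0 + (1.0 - b) * tu for u0, tu in zip(U0, TU)]
    return U


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.2, 0.8]]  # hypothetical transition matrix
    r = [1.0, 0.0]                # hypothetical rewards
    U0 = [0.0, 0.0]
    for gamma in (0.9, 0.999):
        for name, solver in (("standard", standard_vi), ("anchored", anchored_vi)):
            U = solver(U0, P, r, gamma, 200)
            print(gamma, name, bellman_error(U, P, r, gamma))
```

Running the script prints the Bellman error of both iterations for a moderate and a near-1 discount factor, which is the regime the abstract highlights; the comparison depends on the assumed schedule and should not be read as a reproduction of the paper's rates.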

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country: Asia > South Korea > Seoul > Seoul (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Queueing Matching Bandits with Preference Feedback

Neural Information Processing Systems | May-30-2025, 03:08:05 GMT

In these systems, there are two sides: queues (agents) on one side and servers (arms) on the other.

artificial intelligence, data mining, machine learning, (18 more...)

Country: Asia > South Korea (0.14)

Genre:

Research Report > Experimental Study (0.92)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.67)

Learning Symmetric Rules with SATNet

Neural Information Processing Systems | May-29-2025, 22:52:11 GMT

SATNet is a differentiable constraint solver with a custom backpropagation algorithm, which can be used as a layer in a deep-learning system. It is a promising proposal for bridging deep learning and logical reasoning. In fact, SATNet has been successfully applied to learn, among others, the rules of a complex logical puzzle, such as Sudoku, just from input and output pairs where inputs are given as images. In this paper, we show how to improve the learning of SATNet by exploiting symmetries in the target rules of a given but unknown logical puzzle or, more generally, a logical formula. We present SymSATNet, a variant of SATNet that translates the given symmetries of the target rules to a condition on the parameters of SATNet and requires that the parameters have a particular parametric form that guarantees the condition. The requirement dramatically reduces the number of parameters to learn for rules with enough symmetries, and makes the parameter learning of SymSATNet much easier than that of SATNet.

artificial intelligence, machine learning, symmetry, (18 more...)

Country:

Asia > South Korea (0.14)
Africa > Senegal (0.14)

Industry: Leisure & Entertainment > Games (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Self-Guided Masked Autoencoder

Jeongwoo Shin

Neural Information Processing Systems | May-29-2025, 21:46:55 GMT

Masked Autoencoder (MAE) is a self-supervised approach for representation learning, widely applicable to a variety of downstream tasks in computer vision. In spite of its success, it remains unclear exactly what MAE learns and how. In this paper, with an in-depth analysis, we discover that MAE intrinsically learns pattern-based patch-level clustering from surprisingly early stages of pretraining. Building on this understanding, we propose the self-guided masked autoencoder, which internally generates informed masks by utilizing its progress in patch clustering, replacing the naive random masking of the vanilla MAE. Our approach significantly boosts its learning process without relying on any external models or supplementary information, keeping the benefit of the self-supervised nature of MAE intact. Comprehensive experiments on various downstream tasks verify the effectiveness of the proposed method.

artificial intelligence, information, machine learning, (17 more...)

Country: